
fix(license): align SPDX parity and fail closed on malformed output #852

Merged
mstykow merged 1 commit into main from fix/spdx-lid-parity
May 6, 2026


mstykow (Owner) commented May 6, 2026

Summary

  • Canonicalize recovered SPDX-LID unknown expressions to LicenseRef-scancode-unknown-spdx so malformed raw text no longer leaks into SPDX output.
  • Narrowly merge same-line sandwiched GPL clue matches with exact neighboring detections so compare-outputs now matches ScanCode on src/license_detection/spdx_lid/mod.rs license detections and top-level SPDX expression deltas.
  • Harden scanner and output SPDX recombination to fail closed on malformed or partial SPDX strings, with focused regressions for SPDX-LID recovery, grouping, and top-level output shaping.
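
For readers unfamiliar with the first bullet, here is a minimal sketch of what canonicalizing a recovered SPDX-LID expression could look like. It is not the code from this PR: `canonicalize_recovered` and the `known` slice are hypothetical stand-ins for the real license index lookup, and only the `LicenseRef-scancode-unknown-spdx` key comes from the summary above. The sketch emits a canonical spelling when the recovered text matches a known identifier and emits the unknown key otherwise, so the raw text never reaches the output.

```rust
/// Key emitted for SPDX-LID text that cannot be resolved (named in the PR summary).
const UNKNOWN_SPDX: &str = "LicenseRef-scancode-unknown-spdx";

/// Hypothetical helper, not taken from the PR.
/// `known` stands in for whatever license index the real code consults.
fn canonicalize_recovered(raw: &str, known: &[&str]) -> String {
    let trimmed = raw.trim();
    known
        .iter()
        // Case-insensitive match against the known identifiers.
        .find(|k| k.eq_ignore_ascii_case(trimmed))
        // Emit the canonical spelling from the index, never the raw text.
        .map(|k| (*k).to_string())
        // Anything unrecognized collapses to the unknown key.
        .unwrap_or_else(|| UNKNOWN_SPDX.to_string())
}
```

With this shape, a malformed tag such as `GPL-2.0 WITH (broken` yields the unknown key instead of being copied verbatim into the SPDX output.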

Scope and exclusions

  • Included: SPDX-LID recovery/output canonicalization, narrow detection-grouping parity fix, and strict SPDX recombination hardening in scanner/output paths.
  • Explicit exclusions: percentage_of_license_text parity and the remaining compare artifact noise in license_clues serialization (from_file / rule_url normalization).
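
The "fail closed" recombination mentioned above can be illustrated with a small sketch. This is an assumption about the general technique, not the PR's implementation: `looks_complete` and `combine_and` are hypothetical names, and the validity check is deliberately crude (balanced parentheses, no leading, trailing, or doubled operator). The point is that a single malformed or partial part makes the whole combination return `None` rather than producing a best-effort string.

```rust
/// Hypothetical operator test for the sketch below.
fn is_op(token: &str) -> bool {
    matches!(token, "AND" | "OR" | "WITH")
}

/// Hypothetical completeness check: non-empty, balanced parentheses,
/// and no leading, trailing, or adjacent operators.
fn looks_complete(expr: &str) -> bool {
    let mut depth = 0i32;
    for c in expr.chars() {
        match c {
            '(' => depth += 1,
            ')' => {
                depth -= 1;
                if depth < 0 {
                    return false; // closing parenthesis without an opener
                }
            }
            _ => {}
        }
    }
    if depth != 0 {
        return false; // unclosed parenthesis
    }
    let tokens: Vec<&str> = expr.split_whitespace().collect();
    match (tokens.first(), tokens.last()) {
        (Some(first), Some(last)) => {
            !is_op(*first)
                && !is_op(*last)
                && !tokens.windows(2).any(|w| is_op(w[0]) && is_op(w[1]))
        }
        _ => false, // empty or whitespace-only input
    }
}

/// Joins parts with AND, or returns None if any part is malformed (fail closed).
fn combine_and(parts: &[&str]) -> Option<String> {
    if parts.is_empty() || !parts.iter().all(|p| looks_complete(p)) {
        return None;
    }
    let wrapped: Vec<String> = parts
        .iter()
        .map(|p| {
            let p = p.trim();
            // Parenthesize OR sub-expressions so AND does not change their meaning.
            if parts.len() > 1 && p.split_whitespace().any(|t| t == "OR") {
                format!("({})", p)
            } else {
                p.to_string()
            }
        })
        .collect();
    Some(wrapped.join(" AND "))
}
```

Returning `Option<String>` pushes the decision about what to emit for a rejected combination to the caller, which keeps the malformed text out of the output path by construction.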

Follow-up work

  • Created or intentionally deferred: a separate cleanup if we want raw compare artifacts to stop flagging clue-only path/URL normalization noise.

Signed-off-by: Maxim Stykow <maxim.stykow@gmail.com>
@mstykow mstykow merged commit 4679108 into main May 6, 2026
15 checks passed
@mstykow mstykow deleted the fix/spdx-lid-parity branch May 6, 2026 10:46
